Static Malware Detection Using Stacked BiLSTM and GPT-2

Abstract

In recent years, cyber threats and malicious software attacks have escalated on various platforms. Therefore, it has become essential to develop automated machine learning methods for defending against malware. In the present study, we propose stacked bidirectional long short-term memory (Stacked BiLSTM) and generative pre-trained transformer based (GPT-2) deep learning language models for detecting malicious code. We developed the language models using assembly instructions extracted from the .text sections of malicious and benign Portable Executable (PE) files. We treated each instruction as a sentence and each .text section as a document. We also labeled each sentence and document as benign or malicious, according to the file source. We created three datasets from those sentences and documents. The first dataset, composed of documents, was fed into a Document Level Analysis Model (DLAM) based on Stacked BiLSTM. The second dataset, composed of sentences, was used in Sentence Level Analysis Models (SLAMs) based on Stacked BiLSTM and DistilBERT, a GPT-2 Domain Specific Language Model (GPT2-DSLM), and a GPT-2 General Language Model (GPT2-GLM). Lastly, we merged all assembly instructions without labels, creating the third dataset; we then trained a custom model with it. We compared the malware detection performances of the models. The results showed that the custom model improved GPT2-DSLM and GPT2-GLM performance. In the experiments, the DLAM, the SLAM, GPT2-DSLM, and GPT2-GLM achieved 98.3%, 70.4%, 86.0%, and 76.2% F1 scores, respectively.
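The dataset construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes the .text sections have already been disassembled into lists of instruction strings, and the names `PEText` and `build_datasets`, as well as the choice to join instructions with newlines, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PEText:
    """Disassembled .text section of one PE file (hypothetical container)."""
    instructions: List[str]  # one assembly instruction per entry
    label: str               # "benign" or "malicious", taken from the file source


def build_datasets(
    files: List[PEText],
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]], List[str]]:
    """Build the three datasets: labeled documents, labeled sentences, unlabeled text."""
    documents = []  # one (document, label) pair per .text section
    sentences = []  # one (sentence, label) pair per instruction
    unlabeled = []  # all instructions merged, labels dropped
    for f in files:
        # Assumption: a document is the section's instructions joined by newlines.
        documents.append(("\n".join(f.instructions), f.label))
        for ins in f.instructions:
            sentences.append((ins, f.label))
            unlabeled.append(ins)
    return documents, sentences, unlabeled


# Toy input; real instructions would come from a disassembler.
files = [
    PEText(["push ebp", "mov ebp, esp"], "benign"),
    PEText(["xor eax, eax"], "malicious"),
]
documents, sentences, unlabeled = build_datasets(files)
```

In this sketch the documents would go to a document-level classifier, the labeled sentences to the sentence-level classifiers, and the unlabeled instructions to language-model training.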

Similar resources

Malware Detection through Low-level Features and Stacked Denoising Autoencoders

In recent years, the diffusion of malicious software through various channels has increased the demand for intelligent techniques capable of detecting the spread of new malware in a timely manner. In this work, we focus on the application of Deep Learning methods for malware detection, evaluating their effectiveness when malware is represented by high-level and low-level features, respectively. Experimental resu...

A Static Malware Detection System Using Data Mining Methods

Malicious executables are a serious threat today. They are designed to damage computer systems, and some of them spread over networks without the knowledge of the system's owner. Two approaches have been developed against them: signature-based detection and heuristic-based detection. These approaches perform well against known malicious programs but cannot catch new ones. Diffe...

Malware Detection using Classification of Variable-Length Sequences

In this paper, a novel graph-based method is proposed for classifying variable-length sequences, with the graph serving as feature extraction. The proposed method overcomes the problems that traditional graphs have with variable-length data, without fixing the length of the sequences, by determining the most frequent instructions and placing the remaining instructions in an “other” set, which saves time and memory. Acco...

Mac Malware Detection via Static File Structure Analysis

It is widely acknowledged in the security community that the current signature-based approach to virus detection is no longer adequate. More recently, antivirus software has been doing dynamic malicious behavior detection. While this is more effective, it is computationally expensive, so antivirus software cannot do much of it without degrading the performance of the user’s computer. Static executable anal...

Using IRP for Malware Detection

Run-time malware detection strategies are efficient and robust, and they are receiving more and more attention. In this paper, we use I/O Request Packet (IRP) sequences for malware detection. N-grams are used to analyze the IRP sequences for feature extraction. With the integrated use of the Negative Selection Algorithm (NSA) and the Positive Selection Algorithm (PSA), we get more than 96% true positive rate and 0% false po...

Journal

Journal title: IEEE Access

Year: 2022

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3179384